What is claimed is: 



Claims 



1. A method comprising the steps of:

creating an evaluation model from at least one evaluation phone; 
creating a synthesizer model from at least one synthesizer phone; and 
determining a matrix from the evaluation and synthesizer models. 

2. The method of claim 1:

wherein the at least one evaluation phone comprises a first plurality of
evaluation phones, the at least one synthesizer phone comprises a first plurality of
synthesizer phones; and

wherein the method further comprises the steps of: 
creating a new matrix by subtracting the matrix from an identity matrix;

creating an intermediate matrix comprising the new matrix and a second 
identity matrix; 

determining a first set of specific elements of the intermediate matrix; and 
determining acoustic confusability from one of the specific elements.

3. The method of claim 2, further comprising the steps of: 

creating a second evaluation model comprising the first plurality of 

evaluation phones and additional evaluation phones; 

creating a second matrix from the second evaluation model and the
synthesizer model;

creating a second new matrix by subtracting the second matrix from a third
identity matrix;

creating a second intermediate matrix comprising the second new matrix
and a fourth identity matrix;

determining a second set of specific elements of the intermediate matrix, 
the specific elements corresponding to a column of the second intermediate matrix, 
wherein the second set of specific elements comprise the first set of specific elements and 
a new set of specific elements; and 

determining a second acoustic confusability by using previously performed 
calculations of the first set of elements and by calculating the new set of specific 
elements. 

4. The method of claim 1, wherein the evaluation model comprises a hidden 
Markov model of the at least one evaluation phone and wherein the synthesizer model 
comprises a hidden Markov model of the at least one synthesizer phone. 

5. The method of claim 4, wherein at least one of the hidden Markov models
comprises a plurality of states and a plurality of transitions between states, wherein at
least one of the transitions is a transition from one of the states to itself, wherein at least
one of the transitions is a transition from one of the states to another of the states, wherein
each transition has a transition probability associated with it, and wherein each state has a
probability density associated with it.

6. The method of claim 5, wherein the plurality of states comprises a starting
state, an ending state and an intermediate state, wherein the plurality of transitions
comprise:

a transition from the starting state to itself;

a transition from the starting state to the intermediate state;

a transition from the intermediate state to itself;

a transition from the intermediate state to the ending state; and
a transition from the ending state to itself. 
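
As an illustration of the topology recited in claim 6, the following minimal Python sketch builds a transition matrix for a three-state phone model. The function name, the parameter names, and the numeric probabilities are assumptions chosen for illustration; the claims do not prescribe particular values. Keeping the ending state's self-loop probability below one, so that the remaining mass can leave the phone, is also an assumption of this sketch.

```python
import numpy as np


def three_state_transitions(p_stay_start, p_stay_mid, p_stay_end):
    """Transition matrix for a start -> intermediate -> end topology.

    Row i, column j holds the probability of moving from state i to state j.
    Each state has a self-loop; the only other moves are start -> intermediate
    and intermediate -> end. The last row sums to p_stay_end (assumption: the
    remaining probability exits the phone model).
    """
    return np.array([
        [p_stay_start, 1.0 - p_stay_start, 0.0],
        [0.0, p_stay_mid, 1.0 - p_stay_mid],
        [0.0, 0.0, p_stay_end],
    ])


# Illustrative values only.
A = three_state_transitions(0.6, 0.5, 0.7)
```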



7. The method of claim 1, further comprising the steps of: 

creating a new matrix by subtracting the matrix from an identity matrix;

determining an inverse of the new matrix by the following steps: 

creating an intermediate matrix comprising the new matrix 
and a second identity matrix; 

determining a specific entry of the second identity matrix
that corresponds to acoustic confusability;

determining a specific column or row in which the specific 
entry resides; and 

performing column or row manipulations to create a third
identity matrix in the new matrix while calculating only entries of the
specific column or row in the second identity matrix; and

selecting the specific entry as the acoustic confusability.
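
The partial inversion recited in claim 7 can be sketched as Gauss-Jordan elimination that tracks a single column of the identity matrix instead of all of it. This is a minimal sketch, not a definitive implementation: the function name `inverse_column`, the example matrix, and the use of partial pivoting are assumptions added for illustration and numerical safety.

```python
import numpy as np


def inverse_column(new_matrix, col):
    """Return column `col` of the inverse of `new_matrix`.

    Row operations reduce `new_matrix` to the identity while being applied
    only to the tracked column of the second identity matrix, so the other
    columns of the inverse are never computed.
    """
    n = new_matrix.shape[0]
    a = new_matrix.astype(float).copy()
    b = np.zeros(n)
    b[col] = 1.0  # the tracked column of the identity matrix
    for k in range(n):
        # Partial pivoting (assumption: added for numerical stability).
        p = k + int(np.argmax(np.abs(a[k:, k])))
        if a[p, k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            a[[k, p]] = a[[p, k]]
            b[[k, p]] = b[[p, k]]
        pivot = a[k, k]
        a[k] /= pivot
        b[k] /= pivot
        for r in range(n):
            if r != k and a[r, k] != 0.0:
                factor = a[r, k]
                a[r] -= factor * a[k]
                b[r] -= factor * b[k]
    return b


# Illustrative matrix M and the new matrix N = I - M.
M = np.array([[0.5, 0.3, 0.0],
              [0.0, 0.4, 0.2],
              [0.0, 0.0, 0.1]])
N = np.eye(3) - M
column = inverse_column(N, 2)
confusability = column[0]  # the specific entry: row 0 of the tracked column
```

Tracking one column costs roughly a factor of n less work on the identity side than forming the full inverse.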



8. The method of claim 1, further comprising the steps of: 

creating a new matrix by subtracting the matrix from an identity matrix;
determining an inverse of the new matrix; and

determining acoustic confusability by using the inverse of the new matrix.
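
The steps of claim 8 can be illustrated with a short Python sketch. The function name, the example matrix, and the choice of which element of the inverse to read (row `start`, column `end`) are assumptions made for illustration only.

```python
import numpy as np


def confusability_from_inverse(matrix, start, end):
    """Invert (I - matrix) and read one element of the result."""
    n = matrix.shape[0]
    new_matrix = np.eye(n) - matrix        # subtract the matrix from an identity
    inverse = np.linalg.inv(new_matrix)    # full inverse of the new matrix
    return inverse[start, end]


# Illustrative two-state example.
M = np.array([[0.2, 0.3],
              [0.0, 0.1]])
c = confusability_from_inverse(M, 0, 1)
```

When the spectral radius of the matrix is below one, the inverse of (I - M) equals the sum of all powers of M, so the selected element accumulates the weight of every path between the two chosen states.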



9. The method of claim 8, wherein the step of determining acoustic
confusability by using the inverse of the new matrix comprises the step of selecting one
element of the inverse of the new matrix as the acoustic confusability.

10. The method of claim 5, wherein the step of determining a matrix from the
evaluation and synthesizer models comprises the steps of:

determining a plurality of product machine states; and

determining a plurality of product machine transitions between the product
machine states.



11. The method of claim 10, wherein:

each of the product machine states corresponds to one of the states of the
evaluation model and one of the states of the synthesizer model;
each of the product machine transitions connects one of the product
machine states to the same or another product machine state; and

a product machine transition exists when one or both of the following are
true: a transition connects one evaluation model state with the same or another evaluation
model state and a transition connects one synthesizer model state with the same or
another synthesizer model state.

12. The method of claim 10, wherein the step of determining a matrix from
the evaluation and synthesizer models further comprises the steps of:

determining a product machine transition probability for each of the
plurality of product machine transitions; and

determining a synthetic likelihood for each of the product machine states.

13. The method of claim 10, wherein the matrix comprises a plurality of
elements and wherein each element of the matrix corresponds to a potential transition
between two of the product machine states.

14. The method of claim 13, wherein the step of determining a matrix from
the evaluation and synthesizer models further comprises the steps of:
selecting an element of the matrix;

assigning a probability to the element if a product machine transition
exists between two product machine states corresponding to a potential transition that
corresponds to the element, else assigning a zero to the element; and

continuing the steps of selecting and assigning until each element of the 
matrix has been assigned. 
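
The element-by-element assignment of claim 14 can be sketched in Python as follows. Several points are assumptions of this sketch rather than statements of the claimed method: the product transition probability is taken to be the product of the two component transition probabilities, it is weighted by the synthetic likelihood of the source product state, and only the case in which both component transitions exist is handled. All names and numeric values are hypothetical.

```python
import numpy as np


def product_matrix(a_eval, a_syn, synth_lik):
    """Fill one matrix element per potential product machine transition.

    Product state (i, j) pairs evaluation state i with synthesizer state j
    and is stored at index i * n_syn + j. An element receives a probability
    when the corresponding product transition exists, otherwise zero.
    """
    n_eval, n_syn = a_eval.shape[0], a_syn.shape[0]
    size = n_eval * n_syn
    m = np.zeros((size, size))
    for i in range(n_eval):
        for j in range(n_syn):
            src = i * n_syn + j
            for k in range(n_eval):
                for l in range(n_syn):
                    p = a_eval[i, k] * a_syn[j, l]
                    if p > 0.0:  # transition exists: assign a probability
                        m[src, k * n_syn + l] = p * synth_lik[i, j]
                    # otherwise the element keeps its zero
    return m


# Illustrative two-state models and synthetic likelihoods.
A_eval = np.array([[0.6, 0.4], [0.0, 0.7]])
A_syn = np.array([[0.5, 0.5], [0.0, 0.8]])
synth_lik = np.array([[0.9, 0.2], [0.3, 0.8]])
M = product_matrix(A_eval, A_syn, synth_lik)
```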

15. A method comprising the steps of:

a) creating an evaluation model from a plurality of evaluation phones,
each of the phones corresponding to a first word;

b) creating a synthesizer model from a plurality of synthesizer phones,
each of the phones corresponding to a second word;

c) creating a product machine from the evaluation model and
synthesizer model, the product machine comprising a plurality of transitions and a
plurality of states;

d) determining a matrix from the product machine; and

e) determining acoustic confusability of the first word and the second
word by using the matrix.

16. The method of claim 15, wherein each of the evaluation and synthesizer 

models comprises a hidden Markov model. 

17. The method of claim 16, further comprising the step of determining
synthetic likelihoods for each of the plurality of product machine states.

18. The method of claim 17, wherein each synthetic likelihood is a measure of
the acoustic confusability of two specific observation densities associated with the hidden
Markov models of the evaluation and synthesizer models.

19. The method of claim 17, wherein the synthetic likelihoods are compressed
by normalization.

20. The method of claim 17, wherein the synthetic likelihoods are compressed 
by ranking. 

21. The method of claim 17, wherein all synthetic likelihoods are determined
through a method selected from the group consisting essentially of a cross-entropy
measure, a dominance measure, a decoder measure, and an empirical measure.

22. The method of claim 15, further comprising the steps of:

f) performing steps (a) through (e) for a plurality of word pairs, each 
word pair comprising evaluation and synthesizer models, thereby determining a plurality 
of acoustic confusabilities; and 

g) determining acoustic perplexity by using the plurality of acoustic
confusabilities.

23. The method of claim 15, further comprising the steps of: 

f) performing steps (a) through (e) for a plurality of word pairs, each
word pair comprising evaluation and synthesizer models, thereby determining a plurality
of acoustic confusabilities; and

g) determining synthetic acoustic word error rate by using the
plurality of acoustic confusabilities.

24. A method comprising the steps of:

a) determining acoustic confusability for each of a plurality of word
pairs; and

b) determining a metric by using the acoustic confusabilities.

25. The method of claim 24, wherein step (b) further comprises the step of
determining an acoustic perplexity by using the confusabilities.

26. The method of claim 25, further comprising the steps of: 

c) performing steps (a) and (b) to determine an acoustic perplexity of 
a base bigram language model; 

d) performing steps (a) and (b) to determine an acoustic perplexity of 
an augmented language model; and 

e) determining gain comprising a logarithm of a fraction determined 
by dividing the acoustic perplexity of the augmented language model by the acoustic 
perplexity of the base bigram language model. 
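
The gain of claim 26 reduces to a one-line computation once the two acoustic perplexities are known. In this minimal sketch the function name, the use of the natural logarithm (the claim recites only "a logarithm"), and the numeric inputs are assumptions.

```python
import math


def perplexity_gain(pp_augmented, pp_base):
    """Logarithm of the augmented-to-base acoustic perplexity ratio."""
    if pp_augmented <= 0.0 or pp_base <= 0.0:
        raise ValueError("perplexities must be positive")
    return math.log(pp_augmented / pp_base)


# Illustrative values: the augmented model halves the acoustic perplexity.
gain = perplexity_gain(50.0, 100.0)
```

With this ratio, a negative value indicates that the augmented model has the lower acoustic perplexity.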

27. The method of claim 25, further comprising the step of: 

c) minimizing acoustic perplexity during training of a language 

model. 

28. The method of claim 27, wherein step (c) further comprises the step of 
maximizing a negative logarithm of the acoustic perplexity. 

29. The method of claim 24, wherein step (b) further comprises the step of
determining a Synthetic Acoustic Word Error Rate (SAWER) by using the
confusabilities.

30. The method of claim 29, further comprising the steps of:

c) performing steps (a) and (b) to determine a SAWER of a base 
bigram language model; 

d) performing steps (a) and (b) to determine a SAWER of an
augmented language model; and

e) determining an improvement comprising a difference between the 
SAWER of the augmented language model and the SAWER of the base bigram language 
model. 

31. The method of claim 29, further comprising the step of:

c) minimizing the SAWER during training of a language model. 

32. The method of claim 31, wherein step (c) further comprises the step of
maximizing one minus the SAWER.

33. The method of claim 29, further comprising the steps of: 

c) performing steps (a) and (b) to determine a SAWER for a 

vocabulary; 

d) augmenting the vocabulary with at least one additional word; 

e) performing steps (a) and (b) to determine a SAWER for the
augmented vocabulary; and

f) determining an improvement comprising a difference between the 
SAWER for the vocabulary and the SAWER for the augmented vocabulary. 

34. The method of claim 33, further comprising the steps of:

g) performing steps (d) through (f) for a plurality of additional words; 

h) determining a particular word of the additional words that has the
best improvement; and

i) adding the particular word to the vocabulary.
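
The selection loop of claims 33 and 34 can be sketched as a single greedy step. This is an illustrative sketch under stated assumptions: `sawer` is a hypothetical caller-supplied function that returns the SAWER of a vocabulary, `toy_sawer` is a made-up stand-in with invented numbers, and the improvement is taken as the SAWER of the original vocabulary minus that of the augmented vocabulary.

```python
def add_best_word(vocab, candidates, sawer):
    """Try each additional word and keep the one with the best improvement.

    Returns the (possibly augmented) vocabulary, the chosen word, and its
    improvement. `sawer` maps a set of words to an error rate (hypothetical).
    """
    vocab = set(vocab)
    base = sawer(vocab)
    best_word = None
    best_improvement = float("-inf")
    for word in candidates:
        if word in vocab:
            continue
        improvement = base - sawer(vocab | {word})
        if improvement > best_improvement:
            best_word, best_improvement = word, improvement
    if best_word is None:
        return vocab, None, 0.0
    return vocab | {best_word}, best_word, best_improvement


def toy_sawer(vocab):
    # Hypothetical stand-in for a real SAWER computation.
    return 0.5 - 0.1 * ("cat" in vocab) - 0.2 * ("dog" in vocab)


new_vocab, chosen, improvement = add_best_word({"the", "a"}, ["cat", "dog"], toy_sawer)
```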

35. The method of claim 24, wherein each of the words of the word pairs is 
represented by a hidden Markov model, and wherein step (a) further comprises the steps 
of: 

creating a product machine for each of the plurality of word pairs,
each product machine comprising a plurality of states and a plurality of
transitions determined by the hidden Markov models of a corresponding word pair; and

for each product machine, determining synthetic likelihoods for each of 
the plurality of product machine states. 

36. The method of claim 35, wherein each synthetic likelihood is a measure of 
the acoustic confusability of two specific observation densities associated with the hidden 
Markov models of the corresponding word pair. 

37. The method of claim 35, wherein the synthetic likelihoods are compressed
by normalization.

38. The method of claim 35, wherein the synthetic likelihoods are compressed 
by ranking. 

39. The method of claim 35, wherein all synthetic likelihoods are determined 
through a method selected from the group consisting essentially of a cross-entropy 
measure, a dominance measure, a decoder measure, and an empirical measure.

40. The method of claim 35:

wherein step (a) further comprises the step of, for each acoustic
confusability:

determining a matrix from a corresponding product
machine; and

determining an inverse of a second matrix created by
subtracting the matrix from an identity matrix; and
wherein each hidden Markov model comprises a plurality of phones;
wherein a larger word and a smaller word have an identical sequence of
phones;

wherein the larger of the two words comprises an additional set of phones;
and

wherein a set of calculations performed when determining the inverse of
the matrix for the smaller word is cached and used again when determining the inverse of
the matrix for the larger word.

41. The method of claim 24, wherein step (a) further comprises the steps of, 
for each of the word pairs: 

determining an edit distance between each word of the word pair; and
determining acoustic confusability from the edit distance.



42. The method of claim 41, wherein the edit distance is determined by
determining a number of operations and a type of each operation to change one word of
the word pair into the other word of the word pair.

43. The method of claim 42, wherein the operations are selected from the
group consisting essentially of deletions, substitutions and additions of phones.

44. The method of claim 42, further comprising the step of weighting each
operation.

45. The method of claim 42, further comprising the step of assigning a cost to
each operation.

46. A method for determining acoustic confusability of a word pair, the 
method comprising the steps of: 

determining an edit distance between each word of the word pair; and

determining acoustic confusability from the edit distance. 

47. The method of claim 46, wherein the edit distance is determined by
determining a number of operations and a type of each operation to change one word of
the word pair into the other word of the word pair.

48. The method of claim 47, wherein the operations are selected from the 
group consisting essentially of deletions, substitutions and additions of phones. 

49. The method of claim 47, further comprising the step of weighting each

operation. 

50. The method of claim 47, further comprising the step of assigning a cost to 

each operation. 
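
The edit-distance approach of claims 46 through 50 can be sketched with a standard dynamic-programming (Levenshtein-style) distance over phone sequences, with a separate cost for each operation type. The function names, the default costs, and the phone symbols are assumptions. The mapping from distance to confusability, `exp(-distance)`, is purely hypothetical, since the claims do not specify one.

```python
import math


def phone_edit_distance(src, dst, sub_cost=1.0, del_cost=1.0, add_cost=1.0):
    """Minimum total cost of deletions, substitutions and additions of
    phones needed to change the phone sequence `src` into `dst`."""
    m, n = len(src), len(dst)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + del_cost      # delete every phone of src
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + add_cost      # add every phone of dst
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            same = src[i - 1] == dst[j - 1]
            d[i][j] = min(
                d[i - 1][j] + del_cost,                        # deletion
                d[i][j - 1] + add_cost,                        # addition
                d[i - 1][j - 1] + (0.0 if same else sub_cost)  # substitution
            )
    return d[m][n]


def confusability_from_distance(distance):
    # Hypothetical mapping: smaller distances give higher confusability.
    return math.exp(-distance)


# Illustrative phone sequences (symbols are assumptions).
dist = phone_edit_distance(["K", "AE", "T"], ["K", "AA", "T"])
conf = confusability_from_distance(dist)
```

Per-operation costs could also be made to depend on the specific phones involved, for example a cheaper substitution between acoustically similar phones; that refinement is left out of the sketch.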

51. A system comprising:

a memory that stores computer-readable code; and

a processor operatively coupled to said memory, said processor configured
to implement said computer-readable code, said computer-readable code configured to:
create an evaluation model from at least one evaluation phone;
create a synthesizer model from at least one synthesizer phone; and
determine a matrix from the evaluation and synthesizer models.

52. A system comprising: 

a memory that stores computer-readable code; and 

a processor operatively coupled to said memory, said processor configured 
to implement said computer-readable code, said computer-readable code configured to: 

a) determine acoustic confusability for each of a plurality of word
pairs; and

b) determine a metric by using the acoustic confusabilities.

53. The system of claim 52, wherein the computer-readable code is further 
configured, when performing step (b), to determine an acoustic perplexity by using the 
confusabilities. 

54. The system of claim 52, wherein the computer-readable code is further 
configured, when performing step (b), to determine a Synthetic Acoustic Word Error Rate 
(SAWER) by using the confusabilities.

55. A system for determining acoustic confusability of a word pair, the system
comprising:

a memory that stores computer-readable code; and 

a processor operatively coupled to said memory, said processor configured 
to implement said computer-readable code, said computer-readable code configured to: 
determine an edit distance between each word of the word pair; and 
determine acoustic confusability from the edit distance. 

56. An article of manufacture comprising: 

a computer-readable medium having computer-readable code means 
embodied thereon, the computer-readable program code means comprising: 

a step to create an evaluation model from at least one evaluation phone;
a step to create a synthesizer model from at least one synthesizer phone;
and

a step to determine a matrix from the evaluation and synthesizer models.

57. An article of manufacture comprising: 

a computer-readable medium having computer-readable code means 
embodied thereon, the computer-readable program code means comprising: 

a) a step to determine acoustic confusability for each of a plurality of
word pairs; and

b) a step to determine a metric by using the acoustic confusabilities.

58. The article of manufacture of claim 57, wherein the computer-readable
program code means further comprises, when performing step (b), a step to determine an
acoustic perplexity by using the confusabilities.



YOR920000210US2 



59. The article of manufacture of claim 57, wherein the computer-readable 

program code means further comprises, when performing step (b), a step to determine a 
Synthetic Acoustic Word Error Rate (SAWER) by using the confusabilities. 

60. An article of manufacture for determining acoustic confusability of a word
pair, the article of manufacture comprising:

a computer-readable medium having computer-readable code means 
embodied thereon, the computer-readable program code means comprising: 

determine an edit distance between each word of the word pair; and
determine acoustic confusability from the edit distance.






